Abstract

We consider a remote estimation problem with an energy harvesting sensor and a remote estimator. 
The sensor observes the state of a discrete-time source which may be a finite state Markov chain or a 
multi-dimensional linear Gaussian system. It harvests energy from its environment (say, for example, 
through a solar cell) and uses this energy for the purpose of communicating with the estimator. Due to 
the randomness of energy available for communication, the sensor may not be able to communicate all 
the time. The sensor may also want to save its energy for future communications. The estimator relies on 
messages communicated by the sensor to produce real-time estimates of the source state. We consider the 
problem of finding a communication scheduling strategy for the sensor and an estimation strategy for the 
estimator that jointly minimize an expected sum of communication and distortion costs over a finite time 
horizon. Our goal of joint optimization leads to a decentralized decision-making problem. By viewing 
the problem from the estimator's perspective, we obtain a dynamic programming characterization for the 
decentralized decision-making problem that involves optimization over functions. Under some symmetry 
assumptions on the source statistics and the distortion metric, we show that an optimal communication 
strategy is described by easily computable thresholds and that the optimal estimate is a simple function 
of the most recently received sensor observation. 

I. Introduction 

Many systems for information collection, such as sensor networks and environment monitoring
networks, consist of several network nodes that can observe their environment and communicate
with other nodes in the network. Such nodes are typically capable of making decisions, that
is, they can use the information they have collected from the environment or from other nodes
to make decisions about when to make the next observation or when to communicate or how
to estimate some state variable of the environment. These decisions are usually made in a
decentralized way, that is, different nodes make decisions based on different information. Further,
such decisions must be made under resource constraints. For example, a wireless node in the
network must decide when to communicate under the constraint that it has a limited battery life.
In this paper, we study one such decentralized decision making problem under energy constraints.

A. Nayyar, T. Basar and V. Veeravalli are with the Coordinated Science Laboratory at the University of Illinois at Urbana-Champaign, {anayyar, basar1, vvv}@illinois.edu

D. Teneketzis is with the Department of Electrical Engineering and Computer Science, University of Michigan, teneket@eecs.umich.edu

March 4, 2013 DRAFT

We consider a setup where one sensor is observing an environmental process of interest which 
must be communicated to a remote estimator. The estimator needs to produce estimates of the 
state of the environmental process in real-time. We assume that communication from sensor to 
the estimator is energy consuming. The sensor is assumed to be harvesting energy from the 
environment (for example, by using a solar cell). Thus, the amount of energy available at the 
sensor is a random process. Given the limited and random availability of energy, the sensor has 
to decide when to communicate with the estimator. Given that the sensor may not communicate 
at all times, the estimator has to decide how to estimate the state of the environmental process. 
Our goal is to study the effects of randomness of energy supply on the nature of optimal 
communication scheduling and estimation strategies. 

Communication problems with energy harvesting transmitters have been studied recently
(see [1], [2] and references therein). In these problems the goal is to vary the transmission
rate/power according to the energy availability in order to maximize throughput and/or to
minimize transmission time. In our problem, on the other hand, the goal is to jointly optimize
the communication scheduling and the estimation strategies in order to minimize an accumulated
communication and estimation cost. Problems of communication scheduling and estimation with
a fixed bound on the number of transmissions, independent identically distributed (i.i.d.) sources
and without energy harvesting have been studied in [3] and [4], where scheduling strategies are
restricted to be threshold based. A continuous time version of the problem with Markov state
process and a fixed number of transmissions is studied in [5]. In [6], the authors find an optimal
communication schedule assuming a Kalman-like estimator. Remote estimation of a scalar linear
Gaussian source with communication costs has been studied in [7], where the authors proved that
a threshold based communication schedule and a Kalman-like estimator are jointly optimal. Our
analytical approach borrows extensively from the arguments in [7] and [8]. The latter considered
a problem of paging and registration in a cellular network which can be viewed as a remote
estimation problem.

Problems where the estimator decides when to query a sensor or which sensor to query have 
been studied in [9], [10], [11], [12], [13]. In these problems, the decision making is centralized.
Our problem differs from these setups because the decision to communicate is made by the 
sensor that has more information than the estimator and this leads to a decentralized decision 
making problem. 

In order to appreciate the difficulty of joint optimization of communication and estimation 
strategies, it is important to recognize the role of signaling in estimation. When the sensor 
makes a decision on whether to communicate or not based on its observations of the source, 
then a decision of not to communicate conveys information to the estimator. For example, if the 
estimator knows that the sensor always communicates if the source state is outside an interval 
[a, b], then not receiving any communication from the sensor reveals to the estimator that the 
state must have been inside the interval [a, b]. Thus, even if the source is Markov, the estimator's
estimate may not simply be a function of the most recently received source state since each 
successive "no communication" has conveyed some information. It is this aspect of the problem 
that makes derivation of jointly optimal communication and estimation strategies a difficult 
problem. 

A. Notation 

Random variables are denoted by upper case letters ($X$, $\Gamma$, $\Pi$, $\Theta$), their realizations by the
corresponding lower case letters ($x$, $\gamma$, $\pi$, $\theta$). The notation $X_{a:b}$ denotes the vector
$(X_a, X_{a+1}, \ldots, X_b)$. Bold capital letters $\mathbf{X}$ represent random vectors, while bold small
letters $\mathbf{x}$ represent their realizations. $P(\cdot)$ is the probability of an event, $E(\cdot)$ is the
expectation of a random variable. $\mathbb{1}_A(\cdot)$ is the indicator function of a set $A$. $\mathbb{Z}$ denotes the set of
integers, $\mathbb{Z}_+$ denotes the set of positive integers, $\mathbb{R}$ is the set of real numbers and $\mathbb{R}^n$ is the
$n$-dimensional Euclidean space. $I$ denotes the identity matrix. For two random variables (or random
vectors) $X$ and $Y$ taking values in $\mathcal{X}$ and $\mathcal{Y}$, $P(X = x \mid Y)$ denotes the conditional probability of
the event $\{X = x\}$ given $Y$ and $P(X \mid Y)$ denotes the conditional PMF (probability mass function) or
conditional probability density of $X$ given $Y$. These conditional probabilities are random variables whose
realizations depend on realizations of $Y$.

B. Organization

In Section II, we formulate our problem for a discrete source. We present a dynamic program
for our problem in Section III. This dynamic program involves optimization over a function space.
In Section IV, we find optimal strategies under some symmetry assumptions on the source and
the distortion function. We consider the multi-dimensional Gaussian source in Section V. We
present some important special cases in Section VI. We conclude in Section VII. We provide
some auxiliary results and proofs of key lemmas in Appendices A to E. This work is an extended
version of [14].

II. Problem Formulation 

A. The System Model 

Consider a remote estimation problem with a sensor and a remote estimator. The sensor 
observes a discrete-time Markov process $X_t$, $t = 1, 2, \ldots$. The state space of this source process
is a finite interval $\mathcal{X}$ of the set of integers $\mathbb{Z}$. The estimator relies on messages communicated
by the sensor to produce its estimates of the process $X_t$. The sensor harvests energy from its
environment (say, for example, through a solar cell) and uses this energy for communicating
with the estimator. Let $E_t$ be the energy level at the sensor at the beginning of time $t$. We assume
that the energy level is discrete and takes values in the set $\mathcal{E} = \{0, 1, \ldots, B\}$, where $B \in \mathbb{Z}_+$.
In the time period $t$, the sensor harvests a random amount $N_t$ of energy from its environment,
where $N_t$ is a random variable taking values in the set $\mathcal{N} \subset \mathbb{Z}_+$. The sequence $N_t$, $t = 1, 2, \ldots$,
is an i.i.d. process which is independent of the source process $X_t$, $t = 1, 2, \ldots$.

We assume that a successful transmission from the sensor to the estimator consumes 1 unit of
energy. Also, we assume that the sensor consumes no energy if it just observes the source but
does not transmit anything to the estimator. At the beginning of the time period $t$, the sensor
makes a decision about whether to transmit its current observation and its current energy level
to the estimator or not. We denote by $U_t \in \{0, 1\}$ the sensor's decision at time $t$, where $U_t = 0$
means no transmission and $U_t = 1$ means a decision to transmit. Since the sensor needs at least
1 unit of energy for transmission, we have the constraint that $U_t \le E_t$. Thus, if $E_t = 0$, then
$U_t$ is necessarily 0. The energy level of the sensor at the beginning of the next time step can be
written as

$$E_{t+1} = \min\{E_t + N_t - U_t, B\}, \tag{1}$$

where $B$ is the maximum number of units of energy that the sensor can store. The estimator
receives a message $Y_t$ from the sensor where

$$Y_t = \begin{cases} (X_t, E_t) & \text{if } U_t = 1 \\ \epsilon & \text{if } U_t = 0, \end{cases} \tag{2}$$

where $\epsilon$ denotes that no message was transmitted. The estimator produces an estimate $\hat{X}_t$ at time
$t$ depending on the sequence of messages it received so far. The system operates for a finite
time horizon $T$.
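The energy dynamics in (1) and the message rule in (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`step_energy`, `message`, `simulate_energy`), the use of `None` to stand for the no-message symbol, and the harvesting distribution passed in by the caller are all hypothetical and not part of the paper.

```python
import random


def step_energy(e, n, u, B):
    """Energy update of equation (1): E_{t+1} = min(E_t + N_t - U_t, B)."""
    if u not in (0, 1):
        raise ValueError("u must be 0 or 1")
    if u > e:
        # Constraint U_t <= E_t: no transmission with an empty battery.
        raise ValueError("cannot transmit with zero energy")
    return min(e + n - u, B)


def message(x, e, u):
    """Message of equation (2): (X_t, E_t) on transmission, None otherwise."""
    return (x, e) if u == 1 else None


def simulate_energy(T, B, e1, harvest_values, harvest_probs, decide, seed=0):
    """Simulate energy levels for t = 1..T under a caller-supplied rule.

    decide(t, e) returns the desired action in {0, 1}; it is overridden
    to 0 whenever the battery is empty. Harvests are drawn i.i.d.
    """
    rng = random.Random(seed)
    e, levels, actions = e1, [], []
    for t in range(1, T + 1):
        u = decide(t, e) if e > 0 else 0
        n = rng.choices(harvest_values, weights=harvest_probs)[0]
        levels.append(e)
        actions.append(u)
        e = step_energy(e, n, u, B)
    return levels, actions
```

For instance, `simulate_energy(20, 3, 1, [0, 1], [0.7, 0.3], lambda t, e: 1)` describes a sensor that greedily transmits whenever it has energy; the returned actions never exceed the corresponding energy levels.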

B. Decision Strategies 

The sensor's decision at time $t$ is chosen as a function of its observation history, the history of
energy levels and the sequence of past messages. We allow randomized strategies for the sensor
(see Remark 1). Thus, at time $t$, the sensor makes the decision $U_t = 1$ with probability $p_t$ where

$$p_t = f_t(X_{1:t}, E_{1:t}, Y_{1:t-1}). \tag{3}$$

The constraint $U_t \le E_t$ implies that $p_t = 0$ if $E_t = 0$. The
function $f_t$ is called the decision rule of the sensor at time $t$ and the collection of functions
$f = \{f_1, f_2, \ldots, f_T\}$ is called the decision strategy of the sensor.
The estimator produces its estimate as a function of the messages,

$$\hat{X}_t = g_t(Y_{1:t}). \tag{4}$$

The function $g_t$ is called the decision rule of the estimator at time $t$ and the collection of functions
$g = \{g_1, g_2, \ldots, g_T\}$ is called the decision strategy of the estimator.

C. The Optimization Problem

We have the following optimization problem.

Problem 1. For the model described above, given the statistics of the Markov source and the
initial energy level $E_1$, the statistics of amounts of energy harvested at each time, the sensor's
energy storage limit $B$ and the time horizon $T$, find decision strategies $f$, $g$ for the sensor and
the estimator, respectively, that minimize the following expected cost:

$$J(f, g) = E\Big\{\sum_{t=1}^{T} cU_t + \rho(X_t, \hat{X}_t)\Big\}, \tag{5}$$

where $c > 0$ is a communication cost and $\rho : \mathcal{X} \times \mathcal{X} \mapsto \mathbb{R}$ is a distortion function.

Remark 1. It can be argued that in the above problem, sensor strategies can be assumed to be
deterministic (instead of randomized) without compromising optimality. However, our argument
for characterizing optimal strategies makes use of the possibility of randomizations by the sensor
and therefore we allow for randomized strategies for the sensor.

Discussion of Our Approach: Our approach for Problem 1 makes extensive use of
majorization theory based arguments used in [8] and [7]. As in [8], we first construct a dynamic
program for Problem 1 by reformulating the problem from the estimator's perspective. This
dynamic program involves minimization over a function space. Unlike the approach in [8], we
use majorization theory to argue that the value functions of this dynamic program, under some
symmetry conditions, have a special property that is similar to (but not the same as)
Schur-concavity [15]. We then use this property to characterize the solution of the dynamic program.
This characterization then enables us to find optimal strategies. In Section V, we consider the
problem with a multi-dimensional Gaussian source. We extend our approach for the discrete case
to this problem and, under a suitable symmetry condition, we provide optimal strategies for this
case as well. While the result in [7] is only for a scalar Gaussian source without energy harvesting,
our approach addresses multi-dimensional sources and energy harvesting. Finally, in Section VI,
we mention a few special cases which include the important remote estimation problems where
the sensor can afford only a fixed number of transmissions or where the sensor only has a
communication cost but no constraint on the number of transmissions.

III. Preliminary Results 

Lemma 1. There is no loss of performance if the sensor is restricted to decision strategies of
the form:

$$p_t = f_t(X_t, E_t, Y_{1:t-1}). \tag{6}$$

Proof: Fix the estimator's strategy $g$ to any arbitrary choice. We will argue that, for the
fixed choice of $g$, there is an optimal sensor strategy of the form in the lemma. To do so, we can
show that with a fixed $g$ the sensor's optimization problem is a Markov decision problem with
$(X_t, E_t, Y_{1:t-1})$ as the state of the Markov process. It is straightforward to establish that conditioned
on $X_t$, $E_t$, $Y_{1:t-1}$ and $p_t$, the next state $(X_{t+1}, E_{t+1}, Y_{1:t})$ is independent of past source states and
energy levels and past choices of transmission probabilities. Further, the expected cost at time $t$
is a function of the state and $p_t$. Thus, the sensor's optimization problem is a Markov decision
problem with $(X_t, E_t, Y_{1:t-1})$ as the state of the Markov process. Therefore, using standard results
from Markov decision theory [16], it follows that an optimal sensor strategy is of the form in
the lemma. Since the structure of the sensor's optimal strategy is true for an arbitrary choice of
$g$, it is also true for the globally optimal choice of $g$. This establishes the lemma. ■

In the following analysis, we will consider only sensor's strategies of the form in Lemma 1.
Thus, at the beginning of a time instant $t$ (before the transmission at time $t$ happens), the sensor
only needs to know $X_t$, $E_t$ and $Y_{1:t-1}$, whereas the estimator knows $Y_{1:t-1}$. Problem 1, even
with the sensor's strategy restricted to the form in Lemma 1, is a decision problem with
non-classical information structure [17]. One approach for addressing such problems is to view them
from the perspective of a decision maker who knows only the common information among the
decision makers [18]. In Problem 1, at the beginning of time $t$, the information at the sensor is
$(X_t, E_t, Y_{1:t-1})$, while the information at the estimator is $Y_{1:t-1}$. Thus, the estimator knows the
common information ($Y_{1:t-1}$) between the sensor and the estimator. We will now formulate a
decision problem from the estimator's point of view and show that it is equivalent to Problem 1.

A. An Equivalent Problem 

We formulate a new problem in this section. Consider the model of Section II. At the end of
time $t - 1$, using the information $Y_{1:t-1}$, the estimator decides an estimate

$$\hat{X}_{t-1} = g_{t-1}(Y_{1:t-1}).$$

In addition, at the beginning of time $t$, the estimator decides a function $\Gamma_t : \mathcal{X} \times \mathcal{E} \mapsto [0, 1]$,
using the information $Y_{1:t-1}$. That is,

$$\Gamma_t = \ell_t(Y_{1:t-1}). \tag{7}$$

Then, at time $t$, the sensor evaluates its transmission probability as $p_t = \Gamma_t(X_t, E_t)$. We refer
to $\Gamma_t$ as the prescription to the sensor. The sensor simply uses the prescription to evaluate its
transmission probability. The estimator can select a prescription from the set $\mathcal{G}$, which is the
set of all functions $\gamma$ from $\mathcal{X} \times \mathcal{E}$ to $[0, 1]$ such that $\gamma(x, 0) = 0$, $\forall x \in \mathcal{X}$. It is clear that any
prescription in the set $\mathcal{G}$ satisfies the energy constraint of the sensor, that is, it will result in $p_t = 0$
if $E_t = 0$. We call $\ell := \{\ell_1, \ell_2, \ldots, \ell_T\}$ the prescription strategy of the estimator. Thus, in this
formulation, the estimator is the only decision maker. This idea of viewing the communication
and estimation problem only from the estimator's perspective has been used in [19], [8]. A more
general treatment of this approach of viewing problems with multiple decision makers from the
viewpoint of an agent who knows only the common information can be found in [18]. We can
now formulate the following optimization problem for the estimator.

Problem 2. For the model described above, given the statistics of the Markov source and the
initial energy level $E_1$, the statistics of amounts of energy harvested at each time, the sensor's
energy storage limit $B$ and the time horizon $T$, find an estimation strategy $g$ and a prescription
strategy $\ell$ for the estimator that minimize the following expected cost:

$$J(\ell, g) = E\Big\{\sum_{t=1}^{T} cU_t + \rho(X_t, \hat{X}_t)\Big\}. \tag{8}$$

Problems 1 and 2 are equivalent in the following sense: Consider any choice of strategies $f$, $g$
in Problem 1 and define a prescription strategy in Problem 2 as

$$\ell_t(y_{1:t-1}) = f_t(\cdot, \cdot, y_{1:t-1}).$$

Then, the strategies $\ell$, $g$ achieve the same value of the total expected cost in Problem 2 as the
strategies $f$, $g$ in Problem 1. Conversely, for any choice of strategies $\ell$, $g$ in Problem 2, define a
sensor's strategy in Problem 1 as

$$f_t(\cdot, \cdot, y_{1:t-1}) = \ell_t(y_{1:t-1}).$$

Then, the strategies $f$, $g$ achieve the same value of the total expected cost in Problem 1 as the
strategies $\ell$, $g$ in Problem 2.

Because of the above equivalence, we will now focus on the estimator's problem of selecting
its optimal estimate and the optimal prescriptions (Problem 2). We will then use the solution of
Problem 2 to find optimal strategies in Problem 1.

Recall that $E_t$ is the sensor's energy level at the beginning of time $t$. For ease of exposition, we
define a post-transmission energy level at time $t$ as $E'_t = E_t - U_t$. The estimator's optimization
problem can now be described as a partially observable Markov decision problem (POMDP) as
follows:

1) State processes: $(X_t, E_t)$ is the pre-transmission state; $(X_t, E'_t)$ is the post-transmission
state.

2) Action processes: $\Gamma_t$ is the pre-transmission action; $\hat{X}_t$ is the post-transmission action.

3) Controlled Markovian Evolution of States: The state evolves from $(X_t, E_t)$ to $(X_t, E'_t)$
depending on the realizations of $X_t$, $E_t$ and the choice of pre-transmission action $\Gamma_t$.
The post-transmission state is $(X_t, E_t - 1)$ with probability $\Gamma_t(X_t, E_t)$ and $(X_t, E_t)$ with
probability $1 - \Gamma_t(X_t, E_t)$. The state then evolves in a Markovian manner from $(X_t, E'_t)$ to
$(X_{t+1}, E_{t+1})$ according to known statistics that depend on the transition probabilities of the
Markov source and the statistics of the energy harvested at each time.

4) Observation Process: $Y_t$. The observation is a function of the pre-transmission state and the
pre-transmission action. The observation is $(X_t, E_t)$ with probability $\Gamma_t(X_t, E_t)$ and $\epsilon$ with
probability $1 - \Gamma_t(X_t, E_t)$.

5) Instantaneous Costs: The communication cost at each time is a function of the pre-transmission
state and the pre-transmission action. The communication cost is $c$ with probability $\Gamma_t(X_t, E_t)$
and 0 with probability $1 - \Gamma_t(X_t, E_t)$. The distortion cost at each time step, $\rho(X_t, \hat{X}_t)$, is a
function of the post-transmission state and the post-transmission action.

The above equivalence with POMDPs suggests that the estimator's posterior beliefs on the
states are its information states [16]. We, therefore, define the following probability mass
functions (PMFs):

Definition 1. 1) We define the pre-transmission belief at time $t$ as $\Pi_t := P(X_t, E_t \mid Y_{1:t-1})$.
Thus, for $(x, e) \in \mathcal{X} \times \mathcal{E}$, we have

$$\Pi_t(x, e) = P(X_t = x, E_t = e \mid Y_{1:t-1}).$$

2) We define the post-transmission belief at time $t$ as $\Theta_t := P(X_t, E'_t \mid Y_{1:t})$. Thus, for
$(x, e) \in \mathcal{X} \times \mathcal{E}$, we have

$$\Theta_t(x, e) = P(X_t = x, E'_t = e \mid Y_{1:t}).$$

The following lemma describes the evolution of the beliefs $\Pi_t$ and $\Theta_t$ in time.

Lemma 2. The estimator's beliefs evolve according to the following fixed transformations:

1) $$\Pi_{t+1}(x, e) = \sum_{x' \in \mathcal{X},\, e' \in \mathcal{E}} \big[P(X_{t+1} = x \mid X_t = x')\, P(E_{t+1} = e \mid E'_t = e')\, \Theta_t(x', e')\big].$$

We denote this transformation by $\Pi_{t+1} = Q^1_{t+1}(\Theta_t)$.

2) $$\Theta_t(x, e) = \begin{cases} \delta_{\{x', e'-1\}} & \text{if } Y_t = (x', e') \\ \dfrac{(1 - \Gamma_t(x, e))\,\Pi_t(x, e)}{\sum_{x', e'} (1 - \Gamma_t(x', e'))\,\Pi_t(x', e')} & \text{if } Y_t = \epsilon, \end{cases} \tag{9}$$

where $\delta_{\{x', e'-1\}}$ is a degenerate distribution at $(x', e' - 1)$. We denote this transformation
by $\Theta_t = Q^2_t(\Pi_t, \Gamma_t, Y_t)$.
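The two belief transformations of Lemma 2 can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: beliefs and prescriptions are assumed to be dictionaries keyed by `(x, e)`, `None` stands for the no-message symbol, and the names `post_transmission_belief`, `pre_transmission_belief`, `source_kernel` and `energy_kernel` are hypothetical.

```python
def post_transmission_belief(pi, gamma, y):
    """Post-transmission update of Lemma 2, part 2.

    pi, gamma: dicts keyed by (x, e) holding the pre-transmission belief
    and the prescription. y: received message (x, e), or None if nothing
    was received.
    """
    if y is not None:
        x, e = y
        # A transmission reveals the state and costs one unit of energy.
        return {(x, e - 1): 1.0}
    unnorm = {s: (1.0 - gamma[s]) * p for s, p in pi.items()}
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("the no-transmission event has zero probability")
    return {s: v / total for s, v in unnorm.items() if v > 0.0}


def pre_transmission_belief(theta, source_kernel, energy_kernel):
    """Pre-transmission update of Lemma 2, part 1.

    source_kernel(x_prev) returns {x_next: prob};
    energy_kernel(e_post) returns {e_next: prob}.
    """
    pi_next = {}
    for (x_prev, e_post), p in theta.items():
        for x, px in source_kernel(x_prev).items():
            for e, pe in energy_kernel(e_post).items():
                pi_next[(x, e)] = pi_next.get((x, e), 0.0) + p * px * pe
    return pi_next
```

Note how a "no communication" event still reshapes the belief: states on which the prescription transmits with probability 1 are ruled out, which is the signaling effect discussed in the Introduction.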

We can now describe the optimal strategies for the estimator. 

Theorem 1. Let $\pi$, $\theta$ be any PMFs defined on $\mathcal{X} \times \mathcal{E}$. Define recursively the following functions:

$$W_{T+1}(\pi) := 0,$$

$$V_t(\theta) := \min_{a \in \mathcal{X}} E[\rho(X_t, a) + W_{t+1}(\Pi_{t+1}) \mid \Theta_t = \theta], \tag{10}$$

where $\Pi_{t+1} = Q^1_{t+1}(\Theta_t)$ (see Lemma 2), and

$$W_t(\pi) := \min_{\gamma \in \mathcal{G}} E[c\,\mathbb{1}_{\{U_t = 1\}} + V_t(\Theta_t) \mid \Pi_t = \pi, \Gamma_t = \gamma], \tag{11}$$

where $\Theta_t = Q^2_t(\Pi_t, \Gamma_t, Y_t)$ (see Lemma 2).

For each realization of the post-transmission belief at time $t$, the minimizer in (10) exists
and gives the optimal estimate at time $t$; for each realization of the pre-transmission belief, the
minimizer in (11) exists and gives the optimal prescription at time $t$.

Proof: The minimizer in (10) exists because $\mathcal{X}$ is finite; the minimizer in (11) exists because
the conditional expectation on the right hand side of (11) is a continuous function of $\gamma$ and $\mathcal{G}$
is a compact set. The optimality of the minimizers follows from standard dynamic programming
arguments for POMDPs. ■

The result of Theorem 1 implies that we can solve the estimator's problem of finding optimal
estimates and prescriptions by finding the minimizers in equations (10) and (11) in a backward
inductive manner. Recall that the minimization in equation (11) is over the space of functions
in $\mathcal{G}$. This is a difficult minimization problem. In the next section, we consider a special class
of sources and distortion functions that satisfy certain symmetry assumptions. We do not solve
the dynamic program but instead use it to characterize optimal strategies of the sensor and
the estimator. Such a characterization provides us with an alternative way of finding optimal
strategies of the sensor and the estimator.

IV. Characterizing Optimal Strategies

A. Definitions 

Definition 2. A probability distribution $\mu$ on $\mathbb{Z}$ is said to be almost symmetric and unimodal
(a.s.u.) about a point $a \in \mathbb{Z}$, if for any $k = 0, 1, 2, \ldots$,

$$\mu(a + k) \ge \mu(a - k) \ge \mu(a + k + 1). \tag{12}$$

If a distribution $\mu$ is a.s.u. about 0 and $\mu(x) = \mu(-x)$, then $\mu$ is said to be a.s.u. and even.
Similar definitions hold if $\mu$ is a sequence, that is, $\mu : \mathbb{Z} \mapsto \mathbb{R}$.
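Condition (12) is easy to check numerically for a finitely supported distribution. The following Python sketch is an illustration only; the function names and the dictionary representation (integer point to probability, zero elsewhere) are assumptions made here.

```python
def is_asu(mu, a):
    """Check condition (12): mu(a+k) >= mu(a-k) >= mu(a+k+1) for all k >= 0.

    mu: dict mapping integers to probabilities (zero outside its keys).
    """
    def get(x):
        return mu.get(x, 0.0)

    # Beyond the support every term is zero, so finitely many k suffice.
    K = max((abs(x - a) for x in mu), default=0)
    return all(get(a + k) >= get(a - k) >= get(a + k + 1)
               for k in range(K + 1))


def is_asu_and_even(mu):
    """a.s.u. about 0 together with mu(x) == mu(-x)."""
    return is_asu(mu, 0) and all(mu.get(x, 0.0) == mu.get(-x, 0.0) for x in mu)
```

For example, `{-1: 0.25, 0: 0.5, 1: 0.25}` is a.s.u. and even, while `{0: 0.4, 1: 0.35, -1: 0.25}` is a.s.u. about 0 but not even: the definition tolerates a slight tilt toward the right neighbor.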

Definition 3. We call a source neat if the following assumptions hold:

1) The a priori probability of the initial state of the source $P(X_1)$ is a.s.u. and even and has
finite support.

2) The time evolution of the source is given as:

$$X_{t+1} = X_t + Z_t, \tag{13}$$

where $Z_t$, $t = 1, 2, \ldots, T - 1$, are i.i.d. random variables with a finite support, a.s.u. and
even distribution $\mu$.

Remark 2. Note that the finite support of the distributions of $X_1$ and $Z_t$ and the finiteness of
the time horizon $T$ imply that the state of a neat source always lies within a finite interval in
$\mathbb{Z}$. This finite interval is the state space $\mathcal{X}$.

We borrow the following notation and definition from the theory of majorization. 

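Definition 4. Given $\mu \in \mathbb{R}^n$, let $\mu_{\downarrow} = (\mu_{[1]}, \ldots, \mu_{[n]})$ denote the non-increasing
rearrangement of $\mu$ with $\mu_{[1]} \ge \mu_{[2]} \ge \ldots \ge \mu_{[n]}$. Given two vectors $\mu$ and $\nu$ from $\mathbb{R}^n$, we say that $\nu$
majorizes $\mu$, denoted by $\mu \prec \nu$, if the following conditions hold:

$$\sum_{i=1}^{k} \mu_{[i]} \le \sum_{i=1}^{k} \nu_{[i]}, \quad \text{for } 1 \le k \le n - 1,$$

$$\sum_{i=1}^{n} \mu_{[i]} = \sum_{i=1}^{n} \nu_{[i]}.$$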
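The majorization conditions above translate directly into a short check. The Python sketch below is illustrative; the function name `majorizes` and the floating-point tolerance `tol` are assumptions introduced here.

```python
def majorizes(nu, mu, tol=1e-12):
    """Return True if nu majorizes mu, i.e. mu is majorized by nu."""
    if len(mu) != len(nu):
        raise ValueError("vectors must have the same length")
    m = sorted(mu, reverse=True)  # non-increasing rearrangement of mu
    n = sorted(nu, reverse=True)  # non-increasing rearrangement of nu
    partial_m = partial_n = 0.0
    for i in range(len(m) - 1):
        partial_m += m[i]
        partial_n += n[i]
        if partial_m > partial_n + tol:
            return False
    # The full sums must agree.
    return abs(sum(m) - sum(n)) <= tol
```

Intuitively, a more concentrated PMF majorizes a more spread-out one: a point mass majorizes every PMF of the same length, and every such PMF majorizes the uniform one.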

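We now define a relation $\mathcal{R}$ among possible information states and a property $\mathcal{R}$ of real-valued
functions of information states.

Definition 5 (Binary Relation $\mathcal{R}$). Let $\theta$ and $\tilde{\theta}$ be two distributions on $\mathcal{X} \times \mathcal{E}$. We say $\theta \mathcal{R} \tilde{\theta}$ iff:

(i) For each $e \in \mathcal{E}$, $\theta(\cdot, e) \prec \tilde{\theta}(\cdot, e)$.

(ii) For all $e \in \mathcal{E}$, $\tilde{\theta}(\cdot, e)$ is a.s.u. about the same point $x \in \mathcal{X}$.

Definition 6 (Property $\mathcal{R}$). Let $V$ be a function that maps distributions on $\mathcal{X} \times \mathcal{E}$ to the set of
real numbers $\mathbb{R}$. We say that $V$ satisfies Property $\mathcal{R}$ iff for any two distributions $\theta$ and $\tilde{\theta}$,

$$\theta \mathcal{R} \tilde{\theta} \implies V(\theta) \ge V(\tilde{\theta}).$$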

B. Analysis 

In this section, we will consider Problem 1 under the assumptions that:

(A1) The source is neat (see Definition 3), and

(A2) The distortion function $\rho(x, a)$ is either $\rho(x, a) = \mathbb{1}_{\{x \ne a\}}$ or $\rho(x, a) = |x - a|^k$, for some
$k > 0$.

Throughout the following analysis, we will assume that Assumptions A1 and A2 hold.

Lemma 3. Let $\theta$ be a distribution on $\mathcal{X} \times \mathcal{E}$ such that for all $e \in \mathcal{E}$, $\theta(\cdot, e)$ is a.s.u. about the
same point $x' \in \mathcal{X}$. Then, the minimum in (10) is achieved at $x'$.

Proof: Using Lemma 2, the expression in (10) can be written as

$$V_t(\theta) = W_{t+1}(Q^1_{t+1}(\theta)) + \min_{a \in \mathcal{X}} E[\rho(X_t, a) \mid \Theta_t = \theta].$$

Thus, the minimum is achieved at the point that minimizes the expected distortion function
$\rho(X_t, a)$ given that $X_t$ has the distribution $\theta$. The a.s.u. assumption of all $\theta(\cdot, e)$ about $x'$ and
the nature of distortion functions given in Assumption A2 imply that $x'$ is the minimizer. ■

We now want to characterize the minimizing $\gamma$ in (11). Towards that end, we start with the
following claim.

Claim 1. The value functions $W_t$, $t = 1, 2, \ldots, T + 1$, and $V_t$, $t = 1, 2, \ldots, T$, satisfy Property $\mathcal{R}$.

Proof: See the appendix. ■

Recall that (11) in the dynamic program for the estimator defines $W_t$ as

$$W_t(\pi) := \min_{\gamma \in \mathcal{G}} E[c\,\mathbb{1}_{\{U_t = 1\}} + V_t(\Theta_t) \mid \Pi_t = \pi, \Gamma_t = \gamma]. \tag{14}$$

The following lemma is a consequence of Claim 1.

Lemma 4. Let $\pi$ be a distribution on $\mathcal{X} \times \mathcal{E}$ such that $\pi(\cdot, e)$ is a.s.u. about the same point
$a \in \mathcal{X}$ for all $e \in \mathcal{E}$. Then, the minimum in the definition of $W_t(\pi)$ is achieved by a prescription
$\gamma : \mathcal{X} \times \mathcal{E} \mapsto [0, 1]$ of the form:

$$\gamma(x, e) = \begin{cases} 1 & \text{if } |x - a| > n(e, \pi) \\ 0 & \text{if } |x - a| < n(e, \pi) \\ \alpha(e, \pi) & \text{if } x = a + n(e, \pi) \\ \beta(e, \pi) & \text{if } x = a - n(e, \pi), \end{cases} \tag{15}$$

where for each $e \in \mathcal{E}$, $\alpha(e, \pi), \beta(e, \pi) \in [0, 1]$, $\alpha(e, \pi) \le \beta(e, \pi)$ and $n(e, \pi)$ is a non-negative
integer.

Proof: See Appendix D. ■

Lemmas 3 and 4 can be used to establish a threshold structure for optimal prescriptions and a
simple recursive optimal estimator for Problem 2. At time $t = 1$, by Assumption A1, $\Pi_1$ is such
that $\Pi_1(\cdot, e)$ is a.s.u. about 0 for all $e \in \mathcal{E}$. Hence, by Lemma 4, an optimal prescription at time
$t = 1$ has the threshold structure of (15). If a transmission occurs at time $t = 1$, then the resulting
post-transmission belief $\Theta_1$ is a delta-function and consequently $\Theta_1(\cdot, e)$, $e \in \mathcal{E}$, are a.s.u. about
the same point. If a transmission does not happen at time $t = 1$, then, using Lemma 2 and the
threshold nature of the prescription, it can be shown that the resulting post-transmission belief
is such that $\Theta_1(\cdot, e)$, $e \in \mathcal{E}$, are a.s.u. about 0. Thus, it follows that $\Theta_1$ will always be such that
all $\Theta_1(\cdot, e)$, $e \in \mathcal{E}$, are a.s.u. about the same point and, because of Lemma 3, this point will be
the optimal estimate. Using Lemma 2 and the a.s.u. property of $\Theta_1(\cdot, e)$, it follows that the next
pre-transmission belief $\Pi_2$ will always be such that $\Pi_2(\cdot, e)$, $e \in \mathcal{E}$, are a.s.u. about the same
point (by arguments similar to those in Lemma 14 in the appendix). Hence, by Lemma 4, an
optimal prescription at time $t = 2$ has the threshold structure of (15). Proceeding sequentially
as above establishes the following result.

Theorem 2. In Problem 2, under Assumptions A1 and A2, there is an optimal prescription and
estimation strategy such that

1) The optimal estimate is given as:

$$\hat{X}_t = \begin{cases} \hat{X}_{t-1} & \text{if } y_t = \epsilon \\ x & \text{if } y_t = (x, e), \end{cases} \tag{16}$$

where $\hat{X}_0 := 0$.

2) The pre-transmission belief at any time $t$, $\Pi_t(\cdot, e)$, is a.s.u. about $\hat{X}_{t-1}$, for all $e \in \mathcal{E}$.

3) The prescription at any time has the threshold structure of Lemma 4.

As argued in Section III-A, Problem 2 and Problem 1 are equivalent. Hence, the result of
Theorem 2 implies the following result for Problem 1.

Theorem 3. In Problem 1, under Assumptions A1 and A2, there exist optimal decision strategies
$f^*$, $g^*$ for the sensor and the estimator given as:

$$g^*_t(y_{1:t}) = \begin{cases} a & \text{if } y_t = \epsilon \\ x & \text{if } y_t = (x, e), \end{cases} \tag{17}$$

$$f^*_t(x, e, y_{1:t-1}) = \begin{cases} 1 & \text{if } |x - a| > n_t(e, \pi_t) \\ 0 & \text{if } |x - a| < n_t(e, \pi_t) \\ \alpha_t(e, \pi_t) & \text{if } x = a + n_t(e, \pi_t) \\ \beta_t(e, \pi_t) & \text{if } x = a - n_t(e, \pi_t), \end{cases} \tag{18}$$

where $a = 0$ for $t = 1$, $a = g^*_{t-1}(y_{1:t-1})$ for $t > 1$, and $\pi_t = P(X_t, E_t \mid y_{1:t-1})$.

Theorem 3 can be interpreted as follows: it says that the optimal estimate is the most recently
received value of the source (the optimal estimate is 0 if no source value has been received).
Further, there is a threshold rule at the sensor. The sensor transmits with probability 1 if the
difference between the current source value and the most recently transmitted value exceeds
a threshold that depends on the sensor's current energy level and the estimator's pre-transmission
belief; it does not transmit if the difference between the current source value and the most
recently transmitted value is strictly below the threshold.
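The closed-loop behavior described by Theorem 3 can be sketched in Python. This is a simplified illustration under loud assumptions: the thresholds are supplied by the caller as a dictionary indexed by energy level only (the theorem allows dependence on the belief as well), the randomization at the boundary is replaced by always transmitting when the error equals the threshold, and all function names are hypothetical.

```python
def estimator_update(prev_estimate, y):
    """Keep the last estimate on silence (None), else adopt the received value."""
    return prev_estimate if y is None else y[0]


def sensor_decision(x, e, prev_estimate, threshold):
    """Threshold rule: transmit iff energy is available and the error is large.

    threshold: dict mapping each positive energy level to an integer.
    Assumption: at the boundary |x - a| == threshold[e] we transmit.
    """
    if e == 0:
        return 0
    return 1 if abs(x - prev_estimate) >= threshold[e] else 0


def run(xs, ns, e1, B, threshold):
    """Run sensor and estimator over given source values xs and harvests ns."""
    estimate, e, estimates = 0, e1, []  # the initial estimate is 0
    for x, n in zip(xs, ns):
        u = sensor_decision(x, e, estimate, threshold)
        y = (x, e) if u == 1 else None
        estimate = estimator_update(estimate, y)
        estimates.append(estimate)
        e = min(e + n - u, B)  # energy update of equation (1)
    return estimates
```

With source values `[1, 2, 3, 3, 5]`, harvests `[0, 0, 0, 1, 0]`, one initial unit of energy, `B = 1` and threshold 2, the sensor stays silent at the first step, transmits at the second, is forced to stay silent while its battery is empty, and transmits again once energy has been harvested.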
C. Optimal Thresholds

Theorem 3 gives a complete characterization of the optimal estimation strategy, but it only
provides a structural form of the optimal strategy for the sensor. Our goal now is to find the
exact characterization of the thresholds and the randomization probabilities in the structure of
the optimal strategy of the sensor. We denote the optimal estimation strategy of Theorem 3 by $g^*$
and the class of sensor strategies that satisfy the threshold structure of Theorem 3 as $\mathcal{F}$. We
know that the global minimum expected cost is $J(f, g^*)$, for some $f \in \mathcal{F}$. Any sensor strategy
$\tilde{f}$ that achieves a cost $J(\tilde{f}, g^*) \le J(f, g^*)$ for all $f \in \mathcal{F}$ must be a globally optimum sensor
strategy.

Given that the strategy for the estimator is fixed to $g^*$, we will address the question of finding
the best sensor strategy among all possible strategies (including those not in $\mathcal{F}$). The answer to
this question can be found by a standard dynamic program (see Lemma 5 below). We denote
by $f^*$ the strategy specified by the dynamic program. We have that $J(f, g^*) \ge J(f^*, g^*)$ for all
$f$ (including those not in $\mathcal{F}$). Thus, $f^*$ is a globally optimal sensor strategy. Further, $f^*$ is in
the set $\mathcal{F}$. Thus the dynamic program of Lemma 5 provides a way of computing the optimal
thresholds of Theorem 3.

Lemma 5. Given that the strategy for the estimator is fixed to $g^*$, the best sensor strategy (from
the class of all possible strategies) is of the form $U_t = f_t(D_t, E_t)$, where $D_t := X_t - g^*_{t-1}(Y_{1:t-1})$.
Further, this strategy is described by the following dynamic program:

$$J_{T+1}(\cdot, \cdot) := 0.$$

For positive energy levels $e > 0$,

$$J_t(d, e) := \min\{c + E[J_{t+1}(Z_t, \min(e - 1 + N_t, B))],\ \rho(d) + E[J_{t+1}(d + Z_t, \min(e + N_t, B))]\}, \qquad (19)$$

where $\rho(d)$ is $1_{\{d \neq 0\}}$ if the distortion metric is $\rho(x, a) = 1_{\{x \neq a\}}$ and $\rho(d)$ is $|d|^k$ if the distortion
metric is $\rho(x, a) = |x - a|^k$. For $e > 0$, the optimal action for a realization $(d, e)$ of $(D_t, E_t)$
is $U_t = 1$ iff $J_t(d, e)$ is equal to the first term in the right hand side of (19). If $e = 0$, $J_t(\cdot, 0)$
is the second term in the right hand side of (19) evaluated at $e = 0$ and the optimal action is
$U_t = 0$.
Proof: Once the estimator's strategy is fixed to $g^*$, the sensor's optimization problem is
a standard Markov decision problem (MDP) with $D_t = X_t - g^*_{t-1}(Y_{1:t-1})$ and $E_t$ as the (two-
dimensional) state. The result of the lemma is the standard dynamic program for MDPs. ■

Consider the definition of $J_t(d, e)$ in (19). For a fixed $e > 0$, the first term on the right hand side
of (19) does not depend on $d$, while it can be easily shown that the second term is non-decreasing
in $|d|$. These observations imply that for each $e > 0$, there is a threshold value of $|d|$ below which
$U_t = 0$ and above which $U_t = 1$ in the optimal strategy. Thus, the $f^*$ of Lemma 5 satisfies the
threshold structure of Theorem 3. Comparing the strategy $f^*$ specified by Lemma 5 and the form
of sensor strategies in Theorem 3, we see that

1) The thresholds in $f^*$ depend only on the current energy level of the sensor and not on
the pre-transmission belief $\pi_t$, whereas the thresholds in Theorem 3 could depend on both
the energy level and $\pi_t$.

2) The strategy $f^*$ is purely deterministic, whereas Theorem 3 allowed for possible random-
izations at two points.
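The backward induction of Lemma 5 can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes that $Z_t$ and $N_t$ take finitely many integer values with known probabilities, that the distortion is $|d|^k$ with $k \ge 1$, and that the difference $d$ is truncated to a grid $[-d_{max}, d_{max}]$. The function name `solve_sensor_dp`, the truncation, and all parameter names are hypothetical.

```python
import numpy as np

def solve_sensor_dp(T, B, c, z_pmf, n_pmf, d_max, k=2):
    """Backward induction for J_t(d, e) in (19) on a truncated grid of d.

    z_pmf, n_pmf: dicts {integer value: probability} for Z_t and N_t (assumed finite).
    Returns (ds, J, U) with J[t, i, e] = J_t(ds[i], e) and U[t, i, e] the action at time t.
    Time is 1-based; J[T + 1] is the terminal condition J_{T+1} = 0.
    """
    ds = np.arange(-d_max, d_max + 1)

    def idx(d):
        # Clipping at the grid boundary is an assumption of this sketch.
        return int(np.clip(d, -d_max, d_max)) + d_max

    rho = np.abs(ds).astype(float) ** k
    J = np.zeros((T + 2, len(ds), B + 1))
    U = np.zeros((T + 1, len(ds), B + 1), dtype=int)
    for t in range(T, 0, -1):
        for e in range(B + 1):
            for i, d in enumerate(ds):
                send, stay = c, rho[i]
                for z, pz in z_pmf.items():
                    for n, pn in n_pmf.items():
                        # Second term of (19): no transmission, difference drifts to d + Z_t.
                        stay += pz * pn * J[t + 1, idx(d + z), min(e + n, B)]
                        if e > 0:
                            # First term of (19): transmission resets the difference to Z_t.
                            send += pz * pn * J[t + 1, idx(z), min(e - 1 + n, B)]
                if e > 0 and send <= stay:
                    J[t, i, e], U[t, i, e] = send, 1   # U_t = 1 iff J_t equals the first term
                else:
                    J[t, i, e] = stay                  # includes e = 0, where U_t = 0
    return ds, J, U
```

For each energy level $e > 0$, the threshold at time $t$ can then be read off as the smallest $|d|$ on the grid for which `U[t, i, e]` equals 1.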

V. Multi-dimensional Gaussian Source 

In this section, we consider a variant of Problem 1 with a multi-dimensional Gaussian source.
The state of the source evolves according to the equation

$$X_{t+1} = \lambda A X_t + Z_t, \qquad (20)$$

where $X_t = (X_t^1, X_t^2, \ldots, X_t^n)$ and $Z_t = (Z_t^1, Z_t^2, \ldots, Z_t^n)$ are random vectors taking values in $\mathbb{R}^n$,
$\lambda > 0$ is a real number and $A$ is an orthogonal matrix (that is, the transpose of $A$ is the inverse of $A$
and, more importantly for our purpose, $A$ preserves norms). The initial state $X_1$ has a zero-mean
Gaussian distribution with covariance matrix $s_1 I$, and $Z_1, Z_2, \ldots, Z_{T-1}$ are i.i.d. random vectors
with a zero-mean Gaussian distribution and covariance matrix $s_2 I$. The energy dynamics for the
sensor are the same as in Problem 1.

At the beginning of the time period $t$, the sensor makes a decision about whether to transmit
its current observation vector and its current energy level to the estimator or not. The estimator
receives a message $Y_t$ from the sensor, where $Y_t = (X_t, E_t)$ if $U_t = 1$ and $Y_t = \epsilon$ otherwise.
The estimator produces an estimate $\hat{X}_t = (\hat{X}_t^1, \ldots, \hat{X}_t^n)$ at time $t$ depending on the sequence of
messages it received so far. The system operates for a finite time horizon $T$.
The sensor and estimator make their decisions according to deterministic strategies $f$ and $g$
of the form $U_t = f_t(X_t, E_t, Y_{1:t-1})$ and $\hat{X}_t = g_t(Y_{1:t})$. We assume that for any time and any
realization of past messages, the set of source and energy states for which transmission happens
is an open or a closed subset of $\mathbb{R}^n \times \mathcal{E}$. We have the following optimization problem.

Problem 3. For the model described above, given the statistics of the Markov source and the
initial energy level $E_1$, the statistics of amounts of energy harvested at each time, the sensor's
energy storage limit $B$ and the time horizon $T$, find decision strategies $f$, $g$ for the sensor and
the estimator that minimize the following expected cost:

$$J(f, g) = E\left\{\sum_{t=1}^{T} cU_t + \|X_t - \hat{X}_t\|^2\right\}, \qquad (21)$$

where $c > 0$ is a communication cost and $\|\cdot\|$ is the Euclidean norm.

Remark 3. Note that we have assumed here that the sensor is using a deterministic strategy 
that employs only the current source and energy state and the past transmissions to make the 
decision at time t. Using arguments analogous to those used in proving Lemma [7] it can be 
shown that this restriction leads to no loss of optimality. While randomization was used in our 
proofs for the problem with discrete source (Problem 1), it is not needed when the source state
space is continuous. 

Definition 7. A function $\nu : \mathbb{R}^n \to \mathbb{R}$ is said to be symmetric and unimodal about a point
$a \in \mathbb{R}^n$ if $\|x - a\| \le \|y - a\|$ implies that $\nu(x) \ge \nu(y)$. Further, we use the convention that a
Dirac-delta function at $a$ is also symmetric unimodal about $a$.

For a Borel set $A$ in $\mathbb{R}^n$, we denote by $\mathcal{L}(A)$ the Lebesgue measure of $A$.

Definition 8. For a Borel set $A$ in $\mathbb{R}^n$, we denote by $A^\sigma$ the symmetric rearrangement of $A$. That
is, $A^\sigma$ is an open ball centered at $0$ whose volume is $\mathcal{L}(A)$. Given an integrable, non-negative
function $h : \mathbb{R}^n \to \mathbb{R}$, we denote by $h^\sigma$ its symmetric non-increasing rearrangement. That is,

$$h^\sigma(x) = \int_0^\infty 1_{\{a \in \mathbb{R}^n \mid h(a) > t\}^\sigma}(x)\,dt.$$

Definition 9. Given two integrable, non-negative functions $h_1$ and $h_2$ from $\mathbb{R}^n$ to $\mathbb{R}$, we say that
$h_1$ majorizes $h_2$, denoted by $h_2 \prec h_1$, if the following holds:

$$\int_{\|x\| \le t} h_2^\sigma(x)\,dx \le \int_{\|x\| \le t} h_1^\sigma(x)\,dx \quad \forall t \ge 0. \qquad (22)$$

The condition in (22) is equivalent to saying that for every Borel set $B \subset \mathbb{R}^n$, there exists another
Borel set $B' \subset \mathbb{R}^n$ such that $\mathcal{L}(B) = \mathcal{L}(B')$ and $\int_B h_2(x)\,dx \le \int_{B'} h_1(x)\,dx$.
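A discrete analogue may help fix ideas. For non-negative vectors, the role of the symmetric rearrangement is played by sorting in non-increasing order, and the Borel-set condition above becomes a comparison of partial sums. The sketch below is only this finite-dimensional analogue, not the continuous definition itself; the function name `is_majorized_by` and the tolerance are hypothetical.

```python
import numpy as np

def is_majorized_by(h2, h1, tol=1e-12):
    """Discrete analogue of h2 ≺ h1: compare partial sums of sorted values.

    Sorting in non-increasing order stands in for the symmetric rearrangement;
    the m largest entries stand in for the best Borel set of measure m.
    """
    a = np.sort(np.asarray(h2, dtype=float))[::-1]
    b = np.sort(np.asarray(h1, dtype=float))[::-1]
    n = max(len(a), len(b))
    a = np.pad(a, (0, n - len(a)))   # pad the shorter vector with zeros
    b = np.pad(b, (0, n - len(b)))
    return bool(np.all(np.cumsum(a) <= np.cumsum(b) + tol))

uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.1, 0.7, 0.1, 0.1]
print(is_majorized_by(uniform, peaked), is_majorized_by(peaked, uniform))
```

In this analogue, a more concentrated distribution majorizes a more spread-out one, which matches the intuition behind the relation used in the rest of this section.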

Following the arguments of Sections III and IV, we can view the problem from the estimator's
perspective who at each time $t$ selects a prescription for the sensor before the transmission and
then an estimate on the source after the transmission. Because we have deterministic policies,
the prescriptions are binary-valued functions. We can define at each time $t$ the estimator's pre-
transmission (post-transmission) beliefs as conditional probability densities on $\mathbb{R}^n \times \mathcal{E}$ given the
transmissions $Y_{1:t-1}$ ($Y_{1:t}$).

Lemma 6. The estimator's beliefs evolve according to the following fixed transformations:

1) $\Pi_{t+1}(x, e) = \lambda^{-n} \int_{x' \in \mathbb{R}^n} \sum_{e' \in \mathcal{E}} [P(E_{t+1} = e \mid E'_t = e')\,\mu(x - x')\,\Theta_t(\lambda^{-1}A^{-1}x', e')]$, where $\mu$ is
the probability density function of $Z_t$. We denote this transformation by $\Pi_{t+1} = Q^1_{t+1}(\Theta_t)$.

2) $\Theta_t$ is determined by $\Pi_t$, $\Gamma_t$ and $Y_t$, where $\delta_{\{x', e'-1\}}$ is a degenerate distribution at $(x', e' - 1)$
(the post-transmission belief after receiving $Y_t = (x', e')$). We denote this transformation
by $\Theta_t = Q^2_t(\Pi_t, \Gamma_t, Y_t)$.

Further, we can establish the following analogue of Theorem 1 using dynamic programming
arguments [20].

Theorem 4. Let $\pi$, $\theta$ be any pre-transmission and post-transmission belief. Define recursively
the following functions:

$$W_{T+1}(\pi) := 0 \qquad (23)$$

$$V_t(\theta) := \inf_{a \in \mathbb{R}^n} E[\|X_t - a\|^2 + W_{t+1}(\Pi_{t+1}) \mid \Theta_t = \theta] \qquad (24)$$

where $\Pi_{t+1} = Q^1_{t+1}(\Theta_t)$ (see Lemma 6), and

$$W_t(\pi) := \inf_{\gamma \in \mathcal{G}} E[c1_{\{U_t = 1\}} + V_t(\Theta_t) \mid \Pi_t = \pi, \Gamma_t = \gamma] \qquad (25)$$

where $\mathcal{G}$ is the set of all functions $\gamma$ from $\mathbb{R}^n \times \mathcal{E}$ to $\{0, 1\}$ such that $\gamma^{-1}(\{0\}) = \mathbb{R}^n \times \{0\} \cup
(\cup_{e=1}^{B} X_e \times \{e\})$, where $X_e$ is an open or closed subset of $\mathbb{R}^n$.

Then, $V_1(\pi_1)$, where $\pi_1$ is the density of $X_1$, is a lower bound on the cost of any strategy.
A strategy that at each time and for each realization of pre-transmission and post-transmission
belief selects a prescription and an estimate that achieves the infima in (24) and (25) is optimal.
Further, even if the infima are not always achieved, it is possible to find a strategy with
performance arbitrarily close to the lower bound.

In order to completely characterize the solution of the dynamic program in Theorem 4, we
define the following relation on the possible realizations of estimator's beliefs.

Definition 10 (Binary Relation $\mathcal{R}^n$). Let $\theta$ and $\tilde{\theta}$ be two post-transmission beliefs. We say $\theta \mathcal{R}^n \tilde{\theta}$
iff

(i) For each $e \in \mathcal{E}$, $\theta(\cdot, e) \prec \tilde{\theta}(\cdot, e)$.

(ii) For all $e \in \mathcal{E}$, $\tilde{\theta}(\cdot, e)$ is symmetric and unimodal about the same point $x \in \mathcal{X}$.
A similar relation holds for pre-transmission beliefs.

Definition 11 (Property $\mathcal{R}^n$). Let $V$ be a function that maps probability measures on $\mathbb{R}^n \times \mathcal{E}$ to
the set of real numbers $\mathbb{R}$. We say that $V$ satisfies Property $\mathcal{R}^n$ iff for any two distributions $\theta$
and $\tilde{\theta}$,

$$\theta \mathcal{R}^n \tilde{\theta} \implies V(\theta) \ge V(\tilde{\theta}).$$

We can now state the analogue of Claim 1.

Claim 2. The value functions in Theorem 4, $W_t$, $t = 1, 2, \ldots, T + 1$, and $V_t$, $t = 1, 2, \ldots, T$,
satisfy Property $\mathcal{R}^n$.

Proof: See Appendix E. ■
Because of Claim 2, we can follow arguments similar to those in Section IV to conclude the
following: At time $t = 1$, because $\pi_1(\cdot, e)$ is symmetric unimodal about $0$ for all $e$, it is sufficient
to consider symmetric threshold based prescriptions of the form

$$\gamma(x, e) = \begin{cases} 1 & \text{if } \|x\| \ge r_1(e, \pi_1) \\ 0 & \text{if } \|x\| < r_1(e, \pi_1) \end{cases} \qquad (26)$$

in the right hand side of equation (25) for time $t = 1$. Using such prescriptions implies that $\theta_1(\cdot, e)$
is always symmetric unimodal about some point $a$ which is the optimal estimate in (24) at
time $t = 1$. Further, $\pi_2(\cdot, e)$ will also be symmetric unimodal about $\lambda A a$ and therefore it is
sufficient to restrict to symmetric threshold based prescriptions in (25) at time $t = 2$. Proceeding
sequentially till time $T$ allows us to conclude that at each time, we only need to consider pre- and
post-transmission beliefs that are symmetric unimodal, prescriptions that are symmetric threshold
based and estimates that are equal to the point about which the belief is symmetric. Then, we
can conclude the following result.



Theorem 5. In Problem 3, it is without loss of optimality$^1$ to restrict to strategies $f^*$, $g^*$ that
are given as:

$$g^*_t(y_{1:t}) = \begin{cases} \lambda A a & \text{if } y_t = \epsilon \\ x & \text{if } y_t = (x, e) \end{cases} \qquad (27)$$

$$f^*_t(x, e, y_{1:t-1}) = \begin{cases} 1 & \text{if } \|x - \lambda A a\| \ge r_t(e, \pi_t) \\ 0 & \text{if } \|x - \lambda A a\| < r_t(e, \pi_t) \end{cases} \qquad (28)$$

where $a = 0$ for $t = 1$, $a = g^*_{t-1}(y_{1:t-1})$ for $t > 1$, $\pi_t = P(X_t, E_t \mid y_{1:t-1})$, and $r_t(e, \pi_t) \ge 0$.
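To illustrate how the strategies in (27) and (28) operate together, the following sketch simulates one sample path. It is an illustration under stated assumptions rather than the authors' code: the thresholds are supplied by a hypothetical function `r(t, e)` (they are not computed here), $s_1$ and $s_2$ are read as variances, the harvested energy has a finite distribution `harvest_pmf`, and the energy update $E_{t+1} = \min(E_t - U_t + N_t, B)$ is read off from the dynamic programs in Lemmas 5 and 7.

```python
import numpy as np

def simulate_threshold_policy(T, lam, A, s1, s2, B, e1, harvest_pmf, r, rng):
    """Simulate one path of the threshold/estimation rules (27)-(28).

    r(t, e) is a hypothetical threshold function supplied by the caller.
    Returns (number of transmissions, accumulated squared-error distortion).
    """
    n = A.shape[0]
    vals, probs = zip(*harvest_pmf.items())
    x = rng.normal(0.0, np.sqrt(s1), n)      # X_1 ~ N(0, s1 I); s1 treated as a variance
    a = np.zeros(n)                          # a = 0 for t = 1
    e, n_tx, distortion = e1, 0, 0.0
    for t in range(1, T + 1):
        pred = lam * (A @ a)                 # estimate when nothing is received, as in (27)
        # Transmit only with positive energy and when the deviation reaches the threshold.
        u = int(e > 0 and np.linalg.norm(x - pred) >= r(t, e))
        est = x.copy() if u else pred
        n_tx += u
        distortion += float(np.sum((x - est) ** 2))
        harvested = int(rng.choice(vals, p=probs))
        e = min(e - u + harvested, B)        # assumed energy update
        x = lam * (A @ x) + rng.normal(0.0, np.sqrt(s2), n)   # source dynamics (20)
        a = est
    return n_tx, distortion
```

A call such as `simulate_threshold_policy(10, 0.9, np.eye(2), 1.0, 0.5, 3, 2, {0: 0.5, 1: 0.5}, lambda t, e: 1.0, np.random.default_rng(0))` returns the transmission count and total distortion for one realization; averaging over many runs gives a Monte Carlo estimate of the cost (21) for the chosen thresholds.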

$^1$ That is, there is a strategy of the form in the theorem whose performance is arbitrarily close to the lower bound $V_1(\pi_1)$.

Further, the optimal values of thresholds can be obtained by the following dynamic program,
which is similar to the dynamic program in Lemma 5.

Lemma 7. Given that the strategy for the estimator is fixed to $g^*$, the best sensor strategy (from
the class of all possible strategies) is of the form $U_t = f^*_t(D_t, E_t)$, where $D_t := X_t - g^*_{t-1}(Y_{1:t-1})$.
Further, this strategy is described by the following dynamic program:

$$J_{T+1}(\cdot, \cdot) := 0.$$

For positive energy levels $e > 0$,

$$J_t(d, e) := \min\{c + E[J_{t+1}(Z_t, \min(e - 1 + N_t, B))],\ \|d\|^2 + E[J_{t+1}(d + Z_t, \min(e + N_t, B))]\}, \qquad (29)$$

For $e > 0$, the optimal action for a realization $(d, e)$ of $(D_t, E_t)$ is $U_t = 1$ iff $J_t(d, e)$ is equal
to the first term in the right hand side of (29). If $e = 0$, $J_t(\cdot, 0)$ is the second term in the right
hand side of (29) evaluated at $e = 0$ and the optimal action is $U_t = 0$.
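As a worked instance of the recursion, consider the final stage $t = T$. Since $J_{T+1} \equiv 0$, both expectations in (29) vanish, which gives, for $e > 0$,

$$J_T(d, e) = \min\{c,\ \|d\|^2\}, \qquad J_T(d, 0) = \|d\|^2.$$

Hence $U_T = 1$ iff $\|d\|^2 \ge c$, so the last-stage threshold is $r_T(e) = \sqrt{c}$ for every $e > 0$, independent of the energy level. Thresholds at earlier times are obtained by continuing the recursion backward and generally do depend on $e$.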

VI. Special Cases 

By making suitable assumptions on the source, the energy storage limit $B$ of the sensor and
statistics of initial energy level and the energy harvested at each time, we can derive the following
special cases of Problem 1 in Section II and Problem 3 in Section V.

1) Fixed number of Transmissions: Assume that the initial energy level $E_1 = K$ ($K \le B$)
with probability 1 and that the energy harvested at any time is $N_t = 0$ with probability 1. Under
these assumptions, Problem 1 can be interpreted as capturing the scenario when the sensor can
afford at most $K$ transmissions during the time-horizon with no possibility of energy harvesting.
This is similar to the model in [3].

2) No Energy Constraint: Assume that the storage limit $B = 1$ and that the initial energy level
and the energy harvested at each time are 1 with probability 1. Then, it follows that at any time $t$,
$E_t = 1$ with probability 1. Thus, the sensor is always guaranteed to have energy to communicate.
Under these assumptions, Problem 1 can be interpreted as capturing the scenario when the sensor
has no energy constraints (it still has energy costs because of the term $cU_t$ in the objective).
This is similar to the model in [7].

3) I.I.D. Source: The analysis of Sections IV and V can be repeated if the source evolution is
assumed to be $X_{t+1} = Z_t$, where $Z_t$ are the i.i.d. noise variables. For i.i.d. sources, the optimal
estimate is the mean value of the source in case of no transmission. Also, the dynamic program of
Lemma 5 can be used for finite valued i.i.d. sources by replacing $D_t$ with $X_t$ and changing (19)
to $J_t(d, e) := \min\{c + E[J_{t+1}(X_{t+1}, \min(e - 1 + N_t, B))],\ \rho(d) + E[J_{t+1}(X_{t+1}, \min(e + N_t, B))]\}$.
A similar dynamic program can be written for the Gaussian source.
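The first two special cases can be checked directly on the energy recursion. The sketch below assumes the update $E_{t+1} = \min(E_t - U_t + N_t, B)$ that appears inside the dynamic programs of Lemmas 5 and 7, and assumes that a requested transmission is simply dropped when $E_t = 0$; the function name `energy_path` is hypothetical.

```python
def energy_path(e1, B, actions, harvest):
    """Trace E_{t+1} = min(E_t - U_t + N_t, B) for given requested actions and harvests.

    A requested transmission is ignored when the current energy is 0 (assumption).
    Returns (list of energy levels E_1, ..., E_{T+1}, number of transmissions made).
    """
    e, path, sent = e1, [e1], 0
    for u, n in zip(actions, harvest):
        u = 1 if (u and e > 0) else 0
        sent += u
        e = min(e - u + n, B)
        path.append(e)
    return path, sent

# Case 1: E_1 = K = 2, no harvesting; at most K transmissions are possible.
print(energy_path(2, 5, [1] * 6, [0] * 6))
# Case 2: B = 1, E_1 = 1, one unit harvested each step; energy is always available.
print(energy_path(1, 1, [1] * 6, [1] * 6))
```

In the first call the sensor exhausts its budget after $K$ transmissions, and in the second the energy level stays at 1 throughout, matching the interpretations given in cases 1) and 2).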
VII. Conclusion

We considered the problem of finding globally optimal communication scheduling and esti- 
mation strategies in a remote estimation problem with an energy harvesting sensor and a finite- 
valued or a multi-dimensional Gaussian source. We established the global optimality of a simple 
energy-dependent threshold-based communication strategy and a simple estimation strategy. Our 
results considerably simplify the off-line computation of optimal strategies as well as their on-line 
implementation. 

Our approach started with providing a POMDP based dynamic program for the decentralized 
decision making problem. Dynamic programming solutions often rely on finding a key property 
of value functions (such as concavity or quadratic-ness) and exploiting this property to char- 
acterize the solution. In dynamic programs that arise from decentralized problems, however, 
value functions involve minimization over functions [18] and hence the usual properties of value 
functions are either not applicable or not useful. In such problems, there is a need to find the 
right property of value functions that can be used to characterize optimal solutions. We believe 
that this work demonstrates that, in some problems, majorization based properties related to 
Schur concavity may be the right value function property to exploit. 
Appendix A
Lemmas from |0O, Section VI 

A. For the discrete source 

Lemma 8. If $\mu$ is a.s.u. and even and $\xi$ is a.s.u. about $a$, then the convolution $\xi * \mu$ is a.s.u.
about $a$.

Lemma 9. If $\mu$ is a.s.u. and even, $\tilde{\xi}$ is a.s.u. and $\xi \prec \tilde{\xi}$, then $\xi * \mu \prec \tilde{\xi} * \mu$.

B. For the multi-dimensional Gaussian source

Lemma 10. If $\mu$ and $\nu$ are two non-negative integrable functions on $\mathbb{R}^n$ and $\mu \prec \nu$, then
$\int_{\mathbb{R}^n} \mu^\sigma(x) h(x)\,dx \le \int_{\mathbb{R}^n} \nu^\sigma(x) h(x)\,dx$ for any symmetric unimodal function $h$.

Lemma 11. If $\mu$ and $\nu$ are two non-negative integrable functions on $\mathbb{R}^n$, then $\int_{\mathbb{R}^n} \mu(x)\nu(x)\,dx \le
\int_{\mathbb{R}^n} \mu^\sigma(x)\nu^\sigma(x)\,dx$. (This lemma is known as the Hardy-Littlewood inequality [21].)

Lemma 12. If $\mu$ is symmetric unimodal about $0$, $\tilde{\xi}$ is symmetric unimodal and $\xi \prec \tilde{\xi}$, then
$\xi * \mu \prec \tilde{\xi} * \mu$.

Appendix B 
Other Preliminary Lemmas 

Lemma 13. Let $h_1$ be a non-negative, integrable function from $\mathbb{R}^n$ to $\mathbb{R}$ such that $h_1$ is symmetric
unimodal about a point $a$. Let $h_2$ be a pdf on $\mathbb{R}^n$ that is symmetric unimodal about $0$. Then,
$h_1 * h_2$ is symmetric unimodal about $a$.

Proof: For ease of exposition, we will assume that both $h_1$ and $h_2$ are symmetric unimodal
about $0$. If $h_1$ is symmetric unimodal about a non-zero point, then to obtain $h_1 * h_2$ we can first
do a translation of $h_1$ so that it is symmetric unimodal about $0$, carry out the convolution and
translate the result back.

Consider two points $x \neq y$ such that $\|x\| = \|y\|$. Then, we can always find an orthogonal
matrix $Q$ such that $y = Qx$. Then,

$$(h_1 * h_2)(y) = (h_1 * h_2)(Qx) = \int_z h_1(z) h_2(Qx - z)\,dz \qquad (30)$$

Carrying out a change of variables so that $z = Qz'$, the above integral becomes

$$\int_{z'} h_1(Qz') h_2(Qx - Qz')\,dz' = \int_{z'} h_1(Qz') h_2(Q(x - z'))\,dz' = \int_{z'} h_1(z') h_2(x - z')\,dz' = (h_1 * h_2)(x) \qquad (31)$$

where we used the symmetric nature of $h_1$ and $h_2$ and the fact that the orthogonal matrix
preserves norm. Thus, any two points with the same norm have the same value of $h_1 * h_2$. This
establishes the symmetry of $h_1 * h_2$. Next, we look at unimodality. We follow an argument similar
to the one used in [22]. Because of symmetry, it suffices to show that $(h_1 * h_2)((x_1, 0, \ldots, 0))$ is
non-increasing for $x_1 \in [0, \infty)$. (Here, $(x_1, 0, \ldots, 0)$ is the $n$-dimensional vector with all but the
first coordinate equal to 0.)

$$(h_1 * h_2)((x_1, 0, \ldots, 0)) = \int_z h_2(z) h_1((x_1, 0, \ldots, 0) - z)\,dz = E[h_1((x_1, 0, \ldots, 0) - Z)], \qquad (32)$$

where $Z$ is a random vector with pdf $h_2$. Define a new random variable $Y_{x_1} := h_1((x_1, 0, \ldots, 0) - Z)$.
Then,

$$E[h_1((x_1, 0, \ldots, 0) - Z)] = E[Y_{x_1}] = \int_0^\infty P(Y_{x_1} > t)\,dt \qquad (33)$$

We now prove that for any given $t \ge 0$, $P(Y_{x_1} > t)$ is non-increasing in $x_1$. This would imply
that the integral in (33), and hence $(h_1 * h_2)((x_1, 0, \ldots, 0))$, is non-increasing in $x_1$.

The symmetric unimodal nature of $h_1$ implies that $Y_{x_1} > t$ if and only if $\|(x_1, 0, \ldots, 0) - Z\| < r$
(or $\|(x_1, 0, \ldots, 0) - Z\| \le r$) for some constant $r$ whose value varies with $t$. Thus,

$$P(Y_{x_1} > t) = P(\|(x_1, 0, \ldots, 0) - Z\| < r) = \int_{S(x_1, r)} h_2(z)\,dz, \qquad (34)$$

where $S(x_1, r)$ is the $n$-dimensional (open) sphere centered at $(x_1, 0, \ldots, 0)$ with radius $r$. It can be
easily verified that the symmetric unimodal nature of $h_2$ implies that as the center of the sphere
$S(x_1, r)$ is shifted away from the origin (keeping the radius fixed), the integral in (34) cannot
increase. This concludes the proof. ■
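A quick numerical sanity check of the statement is possible in one dimension on a grid. The sketch below is a discrete, one-dimensional illustration only (it does not reproduce the $n$-dimensional argument above); the helper name `is_symmetric_unimodal` and the two example sequences are hypothetical.

```python
import numpy as np

def is_symmetric_unimodal(h, tol=1e-12):
    """Check that an odd-length sequence is symmetric about its middle entry
    and non-increasing when moving away from the middle."""
    h = np.asarray(h, dtype=float)
    mid = len(h) // 2
    symmetric = np.allclose(h, h[::-1], atol=tol)
    non_increasing = np.all(np.diff(h[mid:]) <= tol)
    return bool(symmetric and non_increasing)

h1 = np.array([1.0, 2.0, 4.0, 2.0, 1.0])     # symmetric unimodal about its center
h2 = np.array([0.1, 0.2, 0.4, 0.2, 0.1])     # a pmf, symmetric unimodal about 0
conv = np.convolve(h1, h2)                   # discrete convolution, length 9
print(is_symmetric_unimodal(conv))
```

The convolution is again symmetric about its center and non-increasing away from it, in line with the lemma (and with Lemma 8 for the discrete source).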

Appendix C 
Proof of Claim 1 

Since $W_{T+1}(\pi) := 0$ for any choice of $\pi$, it trivially satisfies Property $\mathcal{R}$. We will now proceed
in a backward inductive manner.

Step 1: If $W_{t+1}$ satisfies Property $\mathcal{R}$, we will show that $V_t$ satisfies Property $\mathcal{R}$ too.
Using Lemma 2, the expression in (10) can be written as

$$V_t(\theta) := W_{t+1}(Q^1_{t+1}(\theta)) + \min_{a \in \mathcal{X}} E[\rho(X_t, a) \mid \Theta_t = \theta] \qquad (35)$$

We will look at the two terms in the above expression separately and show that each term
satisfies Property $\mathcal{R}$. To do so, we will use the following lemmas.

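Lemma 14. $\theta \mathcal{R} \tilde{\theta} \implies Q^1_{t+1}(\theta) \mathcal{R} Q^1_{t+1}(\tilde{\theta})$.

Proof: Let $\pi = Q^1_{t+1}(\theta)$ and $\tilde{\pi} = Q^1_{t+1}(\tilde{\theta})$. Then, from Lemma 2,

$$\pi(x, e) = \sum_{x' \in \mathbb{Z},\, e' \in \mathcal{E}} [P(X_{t+1} = x \mid X_t = x') P(E_{t+1} = e \mid E'_t = e')\,\theta(x', e')]$$

$$= \sum_{e' \in \mathcal{E}} \Big[P(E_{t+1} = e \mid E'_t = e') \sum_{x' \in \mathbb{Z}} [P(X_{t+1} = x \mid X_t = x')\,\theta(x', e')]\Big]$$

$$= \sum_{e' \in \mathcal{E}} \Big[P(E_{t+1} = e \mid E'_t = e') \sum_{x' \in \mathbb{Z}} [P(Z_t = x - x')\,\theta(x', e')]\Big]$$

$$= \sum_{e' \in \mathcal{E}} P(E_{t+1} = e \mid E'_t = e')\,\zeta(x, e'), \qquad (36)$$

where $\zeta(x, e') = \sum_{x' \in \mathbb{Z}} [P(Z_t = x - x')\,\theta(x', e')]$. Similarly,

$$\tilde{\pi}(x, e) = \sum_{e' \in \mathcal{E}} P(E_{t+1} = e \mid E'_t = e')\,\tilde{\zeta}(x, e') \qquad (37)$$

where $\tilde{\zeta}(x, e') = \sum_{x' \in \mathbb{Z}} [P(Z_t = x - x')\,\tilde{\theta}(x', e')]$. In order to show that $\pi(\cdot, e) \prec \tilde{\pi}(\cdot, e)$, it
suffices to show that $\zeta(\cdot, e') \prec \tilde{\zeta}(\cdot, e')$ and that $\tilde{\zeta}(\cdot, e')$ are a.s.u. about the same point for all
$e' \in \mathcal{E}$. It is clear that

$$\zeta(\cdot, e') = \mu * \theta(\cdot, e'), \qquad \tilde{\zeta}(\cdot, e') = \mu * \tilde{\theta}(\cdot, e'),$$

where $\mu$ is the distribution of $Z_t$ and $*$ denotes convolution. We now use the result in Lemmas
8 and 9 from Appendix A to conclude that $\mu * \theta(\cdot, e') \prec \mu * \tilde{\theta}(\cdot, e')$ and that $\mu * \tilde{\theta}(\cdot, e')$ is a.s.u.
about the same point as $\tilde{\theta}(\cdot, e')$. Thus, we have established that for all $e \in \mathcal{E}$, $\pi(\cdot, e) \prec \tilde{\pi}(\cdot, e)$.
Similarly, we can argue that $\tilde{\pi}(\cdot, e)$ are a.s.u. about the same point since $\tilde{\zeta}(\cdot, e')$ are a.s.u. about
the same point. Thus,

$$\theta \mathcal{R} \tilde{\theta} \implies Q^1_{t+1}(\theta) \mathcal{R} Q^1_{t+1}(\tilde{\theta}).$$

■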

The above relation combined with the assumption that $W_{t+1}$ satisfies Property $\mathcal{R}$ implies that
the first term in (35) satisfies Property $\mathcal{R}$. The following lemma addresses the second term in
(35).



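Lemma 15. Define $L(\theta) := \min_{a \in \mathcal{X}} E[\rho(X_t, a) \mid \Theta_t = \theta]$. $L(\cdot)$ satisfies Property $\mathcal{R}$.

Proof: For any $a \in \mathcal{X}$, the conditional expectation in the definition of $L(\theta)$ can be written
as

$$\sum_{x \in \mathcal{X}} \rho(x, a)\, m_X\theta(x) \qquad (38)$$

where $m_X\theta(x) = \sum_{e \in \mathcal{E}} \theta(x, e)$ is the marginal distribution of $\theta$. Recall that the distortion function
$\rho(x, a)$ is a non-decreasing function of $|x - a|$. Let $d_i$ be the value of the distortion when $|x - a| = i$.
Let $\mathbb{D} := \{0, d_1, d_1, d_2, d_2, d_3, d_3, \ldots, d_M, d_M\}$, where $M$ is the cardinality of $\mathcal{X}$. It is clear that
the expression in (38) is an inner product of some permutation of $\mathbb{D}$ with $m_X\theta$. For any choice
of $a$, such an inner product is lower bounded as

$$\sum_{x \in \mathcal{X}} \rho(x, a)\, m_X\theta(x) \ge \langle \mathbb{D}_\uparrow, m_X\theta_\downarrow \rangle, \qquad (39)$$

which implies that

$$L(\theta) \ge \langle \mathbb{D}_\uparrow, m_X\theta_\downarrow \rangle, \qquad (40)$$

where $\langle \cdot, \cdot \rangle$ represents inner product, $\mathbb{D}_\uparrow$ is the non-decreasing rearrangement of $\mathbb{D}$ and $m_X\theta_\downarrow$ is
the non-increasing rearrangement of $m_X\theta$. If $\theta \mathcal{R} \tilde{\theta}$, then it follows that $m_X\theta \prec m_X\tilde{\theta}$ and $m_X\tilde{\theta}$
is a.s.u. about some point $b \in \mathcal{X}$. It can be easily established that $m_X\theta \prec m_X\tilde{\theta}$ implies that

$$\langle \mathbb{D}_\uparrow, m_X\theta_\downarrow \rangle \ge \langle \mathbb{D}_\uparrow, m_X\tilde{\theta}_\downarrow \rangle \qquad (41)$$

Further, since $m_X\tilde{\theta}$ is a.s.u. about $b$, $\sum_{x \in \mathcal{X}} \rho(x, b)\, m_X\tilde{\theta}(x) = \langle \mathbb{D}_\uparrow, m_X\tilde{\theta}_\downarrow \rangle$. Thus,

$$L(\tilde{\theta}) = \langle \mathbb{D}_\uparrow, m_X\tilde{\theta}_\downarrow \rangle \qquad (42)$$

Combining (40), (41) and (42) proves the lemma. ■

Thus, both terms in (35) satisfy Property $\mathcal{R}$ and hence $V_t$ satisfies Property $\mathcal{R}$.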
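The lower bound used in the proof of Lemma 15 is the rearrangement inequality: pairing the smallest distortion values with the largest probabilities minimizes the inner product over all permutations. A small brute-force check is sketched below; the example values and the function name `min_inner_product` are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np

def min_inner_product(costs, pmf):
    """Inner product of the non-decreasing rearrangement of the costs with the
    non-increasing rearrangement of the pmf."""
    return float(np.dot(np.sort(costs), np.sort(pmf)[::-1]))

# Distortions |x - b|^2 for x - b in {0, +1, -1, +2, -2}, and an a.s.u.-like pmf.
costs = [0.0, 1.0, 1.0, 4.0, 4.0]
pmf = [0.1, 0.2, 0.4, 0.2, 0.1]

# Brute force over all pairings of distortion values with probability masses.
brute = min(float(np.dot(p, pmf)) for p in itertools.permutations(costs))
print(min_inner_product(costs, pmf), brute)
```

The sorted pairing attains the brute-force minimum, and it coincides with the distortion obtained by estimating at the center of the pmf, which is the equality used to obtain (42).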
Step 2: If $V_t$ satisfies Property $\mathcal{R}$, we will show that $W_t$ satisfies Property $\mathcal{R}$ too.

Consider two distributions $\pi$ and $\tilde{\pi}$ such that $\pi \mathcal{R} \tilde{\pi}$. Recall that (11) defined $W_t(\pi)$ as

$$W_t(\pi) = \min_{\gamma} E[c1_{\{U_t = 1\}} + V_t(\Theta_t) \mid \Pi_t = \pi, \Gamma_t = \gamma] =: \min_{\gamma} \mathbb{W}(\pi, \gamma) \qquad (43)$$

where $\mathbb{W}(\pi, \gamma)$ denotes the conditional expectation in (43). Suppose that the minimum in the
definition of $W_t(\pi)$ is achieved by some prescription $\gamma$, that is, $W_t(\pi) = \mathbb{W}(\pi, \gamma)$. Using $\gamma$,
we will construct another prescription $\tilde{\gamma}$ such that $\mathbb{W}(\tilde{\pi}, \tilde{\gamma}) \le \mathbb{W}(\pi, \gamma)$. This will imply that
$W_t(\tilde{\pi}) \le W_t(\pi)$, thus establishing the statement of Step 2. We start with

$$\mathbb{W}(\pi, \gamma) = E[c1_{\{U_t = 1\}} + V_t(\Theta_t) \mid \Pi_t = \pi, \Gamma_t = \gamma]$$

$$= cP(U_t = 1 \mid \Pi_t = \pi, \Gamma_t = \gamma) + E[V_t(\Theta_t) \mid \Pi_t = \pi, \Gamma_t = \gamma]$$

$$= c \sum_{x, e} \pi(x, e)\gamma(x, e) + E[V_t(Q^2_t(\pi, Y_t, \gamma)) \mid \Pi_t = \pi, \Gamma_t = \gamma] \qquad (44)$$

The second term in (44) can be further written as

$$P(Y_t = \epsilon \mid \Pi_t = \pi, \Gamma_t = \gamma) \times V_t(Q^2_t(\pi, Y_t = \epsilon, \gamma)) + \sum_{x, e} \big[P(Y_t = (x, e) \mid \Pi_t = \pi, \Gamma_t = \gamma) \times V_t(Q^2_t(\pi, Y_t = (x, e), \gamma))\big]$$

$$= \sum_{x', e'} \pi(x', e')(1 - \gamma(x', e')) \times V_t(\theta^\gamma) + \sum_{x, e} \pi(x, e)\gamma(x, e) V_t(\delta_{(x, e-1)}) \qquad (45)$$

where $\theta^\gamma$ is the distribution resulting from $\pi$ and $\gamma$ when $Y_t = \epsilon$ (see Lemma 2). Substituting
(45) in (44) gives the minimum value to be

$$c \sum_{x, e} \pi(x, e)\gamma(x, e) + \sum_{x, e} \pi(x, e)\gamma(x, e) V_t(\delta_{(x, e-1)}) + \sum_{x', e'} \pi(x', e')(1 - \gamma(x', e')) \times V_t(\theta^\gamma) \qquad (46)$$

We will now use the fact that $V_t$ satisfies Property $\mathcal{R}$ to conclude that $V_t(\delta_{(x, e-1)})$ does not
depend on $x$. That is, $V_t(\delta_{(x, e-1)}) = K(e - 1)$, $\forall x \in \mathcal{X}$, where $K(e - 1)$ is a number that only
depends on $e - 1$. Consider $\delta_{(x, e-1)}$ and $\delta_{(x', e-1)}$. It is easy to see that $\delta_{(x, e-1)} \mathcal{R} \delta_{(x', e-1)}$ and
$\delta_{(x', e-1)} \mathcal{R} \delta_{(x, e-1)}$. Since $V_t$ satisfies Property $\mathcal{R}$, it implies that $V_t(\delta_{(x', e-1)}) \le V_t(\delta_{(x, e-1)})$ and
$V_t(\delta_{(x, e-1)}) \le V_t(\delta_{(x', e-1)})$. Thus, $V_t(\delta_{(x, e-1)}) = V_t(\delta_{(x', e-1)}) = K(e - 1)$. Equation (46) now
becomes

$$c \sum_{x, e} \pi(x, e)\gamma(x, e) + \sum_{x, e} \pi(x, e)\gamma(x, e) K(e - 1) + \sum_{x', e'} \pi(x', e')(1 - \gamma(x', e')) \times V_t(\theta^\gamma) \qquad (47)$$

We define $\lambda(e) := \sum_{x} \pi(x, e)(1 - \gamma(x, e))$.

We will now construct another prescription $\tilde{\gamma}$. For that matter, we first define the sequence
$S = \{0, 1, -1, 2, -2, 3, -3, \ldots\}$ and let $s(n)$ denote the $n$th element of this sequence. Recall
that $\tilde{\pi}(\cdot, e)$ is a.s.u. about the same point $a \in \mathcal{X}$ for all $e \in \mathcal{E}$. For each $e \in \mathcal{E}$, define

$$n^*(e) := \min\Big\{n : \sum_{k=1}^{n} \tilde{\pi}(a + s(k), e) \ge \lambda(e)\Big\}$$

and

$$\alpha(e) := \frac{\lambda(e) - \sum_{k=1}^{n^*(e) - 1} \tilde{\pi}(a + s(k), e)}{\tilde{\pi}(a + s(n^*(e)), e)}.$$

Define $\tilde{\gamma}(\cdot, \cdot)$ as

$$\tilde{\gamma}(a + s(k), e) = \begin{cases} 0 & \text{if } k < n^*(e) \\ (1 - \alpha(e)) & \text{if } k = n^*(e) \\ 1 & \text{if } k > n^*(e) \end{cases} \qquad (48)$$

We can show that with the above choice of $\tilde{\gamma}$,

$$\sum_{x} \pi(x, e)(1 - \gamma(x, e)) = \sum_{x} \tilde{\pi}(x, e)(1 - \tilde{\gamma}(x, e)) \qquad (49)$$

and

$$\sum_{x} \pi(x, e)\gamma(x, e) = \sum_{x} \tilde{\pi}(x, e)\tilde{\gamma}(x, e). \qquad (50)$$

Using the same analysis used to obtain (47), we can now evaluate the expression

$$E[c1_{\{U_t = 1\}} + V_t(\Theta_t) \mid \Pi_t = \tilde{\pi}, \Gamma_t = \tilde{\gamma}]$$

to be

$$c \sum_{x, e} \tilde{\pi}(x, e)\tilde{\gamma}(x, e) + \sum_{x, e} \tilde{\pi}(x, e)\tilde{\gamma}(x, e) K(e - 1) + \sum_{x', e'} \tilde{\pi}(x', e')(1 - \tilde{\gamma}(x', e')) \times V_t(\tilde{\theta}^{\tilde{\gamma}}), \qquad (51)$$

where $\tilde{\theta}^{\tilde{\gamma}}$ is the distribution resulting from $\tilde{\pi}$ and $\tilde{\gamma}$ when $Y_t = \epsilon$ (see Lemma 2). Using (49) and (50) in
(51), we obtain the expression

$$c \sum_{x, e} \pi(x, e)\gamma(x, e) + \sum_{x, e} \pi(x, e)\gamma(x, e) K(e - 1) + \sum_{x', e'} \pi(x', e')(1 - \gamma(x', e')) \times V_t(\tilde{\theta}^{\tilde{\gamma}}). \qquad (52)$$

Comparing (47) and (52), we observe that all terms in the two expressions are identical except
for the last term $V_t(\cdot)$. Using the expressions for $\theta^\gamma$ and $\tilde{\theta}^{\tilde{\gamma}}$ from Lemma 2 and the fact that
$\pi \mathcal{R} \tilde{\pi}$, it can be shown that $\theta^\gamma \mathcal{R} \tilde{\theta}^{\tilde{\gamma}}$. Thus, $V_t(\tilde{\theta}^{\tilde{\gamma}}) \le V_t(\theta^\gamma)$. This implies that the expression in
(52) is no more than the expression in (47). This establishes the statement of Step 2.

Appendix D 
Proof of Lemma 0] 

Suppose that the minimum in the definition of $W_t(\pi)$ is achieved by some prescription $\gamma$.
Using $\gamma$, we will construct another prescription $\tilde{\gamma}$ of the form in (13) which also achieves the
minimum. The construction of $\tilde{\gamma}$ is identical to the construction of $\tilde{\gamma}$ in Step 2 of the proof of
Claim 1 (using $\pi$ instead of $\tilde{\pi}$ to define $n^*(e)$, $\alpha(e)$). The a.s.u. assumption of $\pi$ and the nature
of the constructed $\tilde{\gamma}$ imply that $\tilde{\gamma}$ is of the form required in the Lemma.
Appendix E
Proof of Claim 2

The proof follows a backward inductive argument similar to the proof of Claim 1.

Step 1: If $W_{t+1}$ satisfies Property $\mathcal{R}^n$, we will show that $V_t$ satisfies Property $\mathcal{R}^n$ too.
Using Lemma 6, the expression in (24) can be written as

$$V_t(\theta) := W_{t+1}(Q^1_{t+1}(\theta)) + \inf_{a \in \mathbb{R}^n} E[\rho(X_t, a) \mid \Theta_t = \theta] \qquad (53)$$

We will look at the two terms in the above expression separately and show that each term
satisfies Property $\mathcal{R}^n$.
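Lemma 16. $\theta \mathcal{R}^n \tilde{\theta} \implies Q^1_{t+1}(\theta) \mathcal{R}^n Q^1_{t+1}(\tilde{\theta})$.

Proof: Let $\pi = Q^1_{t+1}(\theta)$ and $\tilde{\pi} = Q^1_{t+1}(\tilde{\theta})$. Then, following steps similar to those in the proof
of Claim 1,

$$\pi(x, e) = \sum_{e' \in \mathcal{E}} P(E_{t+1} = e \mid E'_t = e')\,\zeta(x, e'), \qquad (54)$$

where $\zeta(x, e') = \lambda^{-n} \int_{x' \in \mathbb{R}^n} [\mu(x - x')\,\theta(\lambda^{-1}A^{-1}x', e')]$. Similarly,

$$\tilde{\pi}(x, e) = \sum_{e' \in \mathcal{E}} P(E_{t+1} = e \mid E'_t = e')\,\tilde{\zeta}(x, e') \qquad (55)$$

where $\tilde{\zeta}(x, e') = \lambda^{-n} \int_{x' \in \mathbb{R}^n} [\mu(x - x')\,\tilde{\theta}(\lambda^{-1}A^{-1}x', e')]$. In order to show that $\pi(\cdot, e) \prec \tilde{\pi}(\cdot, e)$,
it suffices to show that $\lambda^n \zeta(\cdot, e') \prec \lambda^n \tilde{\zeta}(\cdot, e')$ and that $\tilde{\zeta}(\cdot, e')$ are symmetric unimodal about the
same point for all $e' \in \mathcal{E}$. It is clear that

$$\lambda^n \zeta(\cdot, e') = \mu * \eta(\cdot, e'), \qquad \lambda^n \tilde{\zeta}(\cdot, e') = \mu * \tilde{\eta}(\cdot, e'),$$

where $\eta(x, e') = \theta(\lambda^{-1}A^{-1}x, e')$ and $\tilde{\eta}(x, e') = \tilde{\theta}(\lambda^{-1}A^{-1}x, e')$. Recall that $\theta(\cdot, e) \prec \tilde{\theta}(\cdot, e)$ and
that $\tilde{\theta}(\cdot, e)$ is symmetric unimodal about a point. It can then be easily shown, using the orthogonal
nature of matrix $A$, that $\eta(\cdot, e) \prec \tilde{\eta}(\cdot, e)$ and that $\tilde{\eta}(\cdot, e)$ is symmetric unimodal about a point.
We now use the result in Lemmas 12 and 13 to conclude that $\mu * \eta(\cdot, e') \prec \mu * \tilde{\eta}(\cdot, e')$ and that
$\mu * \tilde{\eta}(\cdot, e')$ is symmetric unimodal about the same point as $\tilde{\eta}(\cdot, e')$. Thus, we have established
that for all $e \in \mathcal{E}$, $\pi(\cdot, e) \prec \tilde{\pi}(\cdot, e)$.

To prove that $\tilde{\pi}(\cdot, e)$ is symmetric and unimodal about the same point, it suffices to show that
$\tilde{\zeta}(\cdot, e')$ are symmetric and unimodal about the same point. Since $\tilde{\zeta}(\cdot, e')$ is a convolution of $\tilde{\eta}(\cdot, e')$
and $\mu$, its symmetric unimodal nature follows from Lemma 13. ■

Lemma 17. Define $L(\theta) := \inf_{a \in \mathbb{R}^n} E[\|X_t - a\|^2 \mid \Theta_t = \theta]$. $L(\cdot)$ satisfies Property $\mathcal{R}^n$.

Proof: Let $\theta \mathcal{R}^n \tilde{\theta}$ such that $\tilde{\theta}(\cdot, e)$ is symmetric unimodal about $b$ for all $e$. For any $a \in \mathbb{R}^n$,
the conditional expectation in the definition of $L(\theta)$ can be written as

$$\sum_{e \in \mathcal{E}} \int_{x \in \mathbb{R}^n} \|x - a\|^2\, \theta(x, e) \qquad (56)$$

Consider any $e$ with positive probability under $\theta$ (that is, $\int_{x \in \mathbb{R}^n} \theta(x, e) > 0$). For a constant
$c > 0$, consider the function $\nu_c(x) = c - \min\{c, \|x - a\|^2\}$. Then,

$$\int_{x \in \mathbb{R}^n} \nu_c(x)\theta(x, e) \le \int_{x \in \mathbb{R}^n} \nu_c^\sigma(x)\theta^\sigma(x, e) = \int_{x \in \mathbb{R}^n} (c - \min\{c, \|x\|^2\})\,\theta^\sigma(x, e) \qquad (57)$$

where we used Lemma 11 in (57). Using the fact that $\theta(\cdot, e) \prec \tilde{\theta}(\cdot, e)$ and Lemma 10, we have

$$\int_{x \in \mathbb{R}^n} (c - \min\{c, \|x\|^2\})\,\theta^\sigma(x, e) \le \int_{x \in \mathbb{R}^n} (c - \min\{c, \|x\|^2\})\,\tilde{\theta}^\sigma(x, e) = \int_{x \in \mathbb{R}^n} (c - \min\{c, \|x - b\|^2\})\,\tilde{\theta}(x, e), \qquad (58)$$

where $b$ is the point about which $\tilde{\theta}$ is symmetric unimodal. Therefore, for any $a \in \mathbb{R}^n$,

$$\int_{x \in \mathbb{R}^n} (c - \min\{c, \|x - a\|^2\})\,\theta(x, e) \le \int_{x \in \mathbb{R}^n} (c - \min\{c, \|x - b\|^2\})\,\tilde{\theta}(x, e)$$

$$\implies \int_{x \in \mathbb{R}^n} \min\{c, \|x - a\|^2\}\,\theta(x, e) \ge \int_{x \in \mathbb{R}^n} \min\{c, \|x - b\|^2\}\,\tilde{\theta}(x, e) \qquad (59)$$

As $c$ goes to infinity, the above inequality implies that

$$\int_{x \in \mathbb{R}^n} \|x - a\|^2\,\theta(x, e) \ge \int_{x \in \mathbb{R}^n} \|x - b\|^2\,\tilde{\theta}(x, e) \qquad (60)$$

Summing up (60) for all $e$ establishes that

$$\sum_{e \in \mathcal{E}} \int_{x \in \mathbb{R}^n} \|x - a\|^2\,\theta(x, e) \ge \sum_{e \in \mathcal{E}} \int_{x \in \mathbb{R}^n} \|x - b\|^2\,\tilde{\theta}(x, e). \qquad (61)$$

Taking the infimum over $a$ in the LHS of the above inequality proves the lemma. ■

Thus, both terms in (53) satisfy Property $\mathcal{R}^n$ and hence $V_t$ satisfies Property $\mathcal{R}^n$.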
Step 2: If $V_t$ satisfies Property $\mathcal{R}^n$, we will show that $W_t$ satisfies Property $\mathcal{R}^n$ too.

Consider two distributions $\pi$ and $\tilde{\pi}$ such that $\pi \mathcal{R}^n \tilde{\pi}$ and $\tilde{\pi}$ is symmetric unimodal about $b$. Recall
that (25) defined $W_t(\pi)$ as

$$W_t(\pi) = \inf_{\gamma} E[c1_{\{U_t = 1\}} + V_t(\Theta_t) \mid \Pi_t = \pi, \Gamma_t = \gamma] =: \inf_{\gamma} \mathbb{W}(\pi, \gamma) \qquad (62)$$

For any $\gamma$, we will construct another prescription $\tilde{\gamma}$ such that $\mathbb{W}(\tilde{\pi}, \tilde{\gamma}) \le \mathbb{W}(\pi, \gamma)$. This will
imply that $W_t(\tilde{\pi}) \le W_t(\pi)$, thus establishing the statement of Step 2. We start with

$$\mathbb{W}(\pi, \gamma) = E[c1_{\{U_t = 1\}} + V_t(\Theta_t) \mid \Pi_t = \pi, \Gamma_t = \gamma]$$

$$= c \sum_{e} \int_{x} \pi(x, e)\gamma(x, e) + E[V_t(Q^2_t(\pi, Y_t, \gamma)) \mid \Pi_t = \pi, \Gamma_t = \gamma] \qquad (63)$$

The second term in (63) can be further written as

$$\sum_{e'} \int_{x'} \pi(x', e')(1 - \gamma(x', e')) \times V_t(\theta^\gamma) + \sum_{e} \int_{x} \pi(x, e)\gamma(x, e) V_t(\delta(x, e - 1)) \qquad (64)$$

where $\theta^\gamma$ is the distribution resulting from $\pi$ and $\gamma$ when $Y_t = \epsilon$ (see Lemma 6). Substituting
(64) in (63) and using the fact that $V_t(\delta(x, e - 1)) = K(e - 1)$ gives

$$c \sum_{e} \int_{x} \pi(x, e)\gamma(x, e) + \sum_{e} \int_{x} \pi(x, e)\gamma(x, e) K(e - 1) + \sum_{e'} \int_{x'} \pi(x', e')(1 - \gamma(x', e')) \times V_t(\theta^\gamma) \qquad (65)$$

We define $\lambda(e) := \int_{x} \pi(x, e)(1 - \gamma(x, e))$. We construct $\tilde{\gamma}$ as follows. Define $r \ge 0$ to be the
radius of an open ball centered at $b$ such that $\int_{\|x - b\| < r} \tilde{\pi}(x, e) = \lambda(e)$. Then, define

$$\tilde{\gamma}(x, e) = \begin{cases} 0 & \text{if } \|x - b\| < r \\ 1 & \text{otherwise} \end{cases} \qquad (66)$$

Using the expressions for $\theta^\gamma$ and $\tilde{\theta}^{\tilde{\gamma}}$ from Lemma 6 and the fact that $\pi \mathcal{R}^n \tilde{\pi}$, it can be shown that
$\theta^\gamma \mathcal{R}^n \tilde{\theta}^{\tilde{\gamma}}$. This establishes the result of Step 2.